An efficient parallel implementation of the MSPAI preconditioner

Authors

  • Thomas Huckle
  • A. Kallischko
  • A. Roy
  • Matous Sedlacek
  • Tobias Weinzierl
Abstract

We present an efficient implementation of the Modified SParse Approximate Inverse (MSPAI) preconditioner. MSPAI generalizes the class of preconditioners based on Frobenius norm minimization, the class of modified preconditioners such as MILU, and interface probing techniques in domain decomposition: it adds probing constraints to the basic SPAI formulation, so the preconditioner can be optimized relative to certain subspaces. We demonstrate MSPAI's qualities for iterative regularization problems arising from image deblurring. Such applications demand a fast, parallel realization of the preconditioner. We present such an implementation and introduce two new optimization techniques. First, we avoid redundant calculations by using a dictionary. Second, our implementation reduces the runtime spent on the most demanding numerical parts, as the code switches to sparse QR decomposition methods wherever profitable. The optimized code runs in parallel with dynamic load balancing.
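The idea of adding probing constraints to a Frobenius norm minimization, and of reusing work through a dictionary, can be illustrated with a small sketch. It is not the authors' implementation. It assumes the probing constraint is expressed by stacking a weighted row onto the least-squares problem, i.e. minimizing ||C M - B||_F with C = [A; rho e^T A] and B = [I; rho e^T]. It uses dense NumPy QR where the paper uses sparse QR, runs serially, and keys its cache on the raw bytes of each submatrix. The names `mspai_sketch`, `pattern`, `probe`, `rho`, and `qr_cache` are hypothetical.

```python
import numpy as np

def mspai_sketch(A, pattern, probe=None, rho=1.0):
    """Column-wise least-squares approximate inverse with an optional probing row.

    pattern[k] lists the row indices allowed to be nonzero in column k of M.
    Returns M and the number of distinct QR factorizations computed.
    """
    n = A.shape[0]
    C, B = A, np.eye(n)
    if probe is not None:
        # Assumed formulation: append the weighted probing condition e^T A M ~ e^T.
        C = np.vstack([A, rho * (probe @ A)])
        B = np.vstack([B, rho * probe])
    M = np.zeros((n, n))
    qr_cache = {}  # stand-in for the dictionary: submatrix bytes -> (Q, R)
    for k in range(n):
        J = list(pattern[k])
        # Only rows of C with a nonzero in the selected columns matter.
        I = np.flatnonzero(np.any(C[:, J] != 0, axis=1))
        sub = C[np.ix_(I, J)]
        key = (sub.shape, sub.tobytes())
        if key not in qr_cache:
            qr_cache[key] = np.linalg.qr(sub)  # dense reduced QR, computed once per distinct submatrix
        Q, R = qr_cache[key]
        # Solve the small least-squares problem min ||sub m - B[I, k]||_2.
        M[J, k] = np.linalg.solve(R, Q.T @ B[I, k])
    return M, len(qr_cache)
```

For a structured matrix such as the tridiagonal Toeplitz matrix with stencil (-1, 4, -1) and a tridiagonal sparsity pattern, the interior columns all produce the same small submatrix, so the cache should need fewer factorizations than there are columns. This kind of repetition is plausible for the structured matrices that arise in image deblurring.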


Similar resources

Modified Sparse Approximate Inverses (MSPAI) for Parallel Preconditioning

The solution of large sparse and ill-conditioned systems of linear equations is a central task in numerical linear algebra. Such systems arise in many applications, such as the discretization of partial differential equations or image restoration. For these systems, Gaussian elimination or other classical direct solvers cannot be used since the dimension of the underlying coefficient matrices is too larg...


Efficient implementation of low time complexity and pipelined bit-parallel polynomial basis multiplier over binary finite fields

This paper presents two efficient implementations of fast and pipelined bit-parallel polynomial basis multipliers over GF(2^m) by irreducible pentanomials and trinomials. The architecture of the first multiplier is based on a parallel and independent computation of powers of the polynomial variable. In the second structure only even powers of the polynomial variable are used. The par...


Accelerating CS in Parallel Imaging Reconstructions Using an Efficient and Effective Circulant Preconditioner

Purpose: Design of a preconditioner for fast and efficient parallel imaging and compressed sensing reconstructions. Theory: Parallel imaging and compressed sensing reconstructions become time consuming when the problem size or the number of coils is large, due to the large linear system of equations that has to be solved in l1 and l2-norm based reconstruction algorithms. Such linear systems can...


On a Two-Level Parallel MIC(0) Preconditioning of Crouzeix-Raviart Non-conforming FEM Systems

In this paper we analyze a two-level preconditioner for finite element systems arising in approximations of second order elliptic boundary value problems by Crouzeix-Raviart non-conforming triangular linear elements. This study is focused on the efficient implementation of the modified incomplete LU factorization MIC(0) as a preconditioner in the PCG iterative method for the linear algebraic sy...


Preconditioning and Parallel Implementation of Implicit Runge-Kutta Methods

A major problem in obtaining an efficient implementation of fully implicit Runge-Kutta (IRK) methods applied to systems of differential equations is to solve the underlying systems of nonlinear equations. Their solution is usually obtained by application of modified Newton iterations with an approximate Jacobian matrix. The systems of linear equations of the modified Newton method can actually b...



Journal title:
  • Parallel Computing

Volume 36, Issue

Pages -

Publication date: 2010